77 research outputs found

    An O(n^{2.75}) algorithm for online topological ordering

    Full text link
    We present a simple algorithm which maintains the topological order of a directed acyclic graph with n nodes under an online edge insertion sequence in O(n^{2.75}) time, independent of the number m of edges inserted. For dense DAGs, this is an improvement over the previous best result of O(min(m^{3/2} log(n), m^{3/2} + n^2 log(n))) by Katriel and Bodlaender. We also provide an empirical comparison of our algorithm with other algorithms for online topological sorting. Our implementation outperforms them on certain hard instances while remaining competitive on random edge insertion sequences leading to complete DAGs. Comment: 20 pages, long version of SWAT'06 paper
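    The paper's O(n^{2.75}) algorithm is not reproduced in the abstract, but the problem it solves is easy to illustrate. Below is a minimal sketch, assuming a simple local-repair strategy in the spirit of earlier incremental algorithms (the names and structure are ours, not the paper's): every node carries an explicit position, and when an inserted edge (u, v) violates the current order, only the window of positions between v and u is rearranged.

```python
# Sketch of online topological order maintenance (illustrative, not the
# paper's O(n^{2.75}) algorithm): keep an explicit position per node and
# repair the affected window when an inserted edge violates the order.

class OnlineTopologicalOrder:
    def __init__(self, n):
        self.succ = [set() for _ in range(n)]   # adjacency: successors of each node
        self.pos = list(range(n))               # pos[v] = current position of v
        self.node_at = list(range(n))           # node_at[i] = node at position i

    def insert_edge(self, u, v):
        if u == v:
            raise ValueError("self-loops are not allowed in a DAG")
        if self.pos[u] < self.pos[v]:
            self.succ[u].add(v)                 # order already consistent
            return
        lo, hi = self.pos[v], self.pos[u]
        # DFS from v, restricted to nodes whose position lies in [lo, hi].
        seen = {v}
        stack = [v]
        while stack:
            x = stack.pop()
            for y in self.succ[x]:
                if y == u:
                    raise ValueError("edge (%d, %d) would create a cycle" % (u, v))
                if self.pos[y] <= hi and y not in seen:
                    seen.add(y)
                    stack.append(y)
        # Reorder the window: nodes not reachable from v keep their relative
        # order and move to the front; v and its descendants follow, also in
        # their original relative order. This keeps every edge consistent.
        window = [self.node_at[i] for i in range(lo, hi + 1)]
        new_order = [x for x in window if x not in seen] + [x for x in window if x in seen]
        for offset, x in enumerate(new_order):
            self.pos[x] = lo + offset
            self.node_at[lo + offset] = x
        self.succ[u].add(v)
```

    For example, calling insert_edge(2, 1) on a fresh OnlineTopologicalOrder(4) moves node 2 ahead of node 1 in the maintained order, while a subsequent insert_edge(1, 2) is rejected because it would close a cycle.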

    Learning to Prune Instances of Steiner Tree Problem in Graphs

    Full text link
    We consider the Steiner tree problem on graphs, where we are given a set of nodes and the goal is to find a tree subgraph of minimum weight that contains all nodes in the given set, potentially including additional nodes. This is a classical NP-hard combinatorial optimisation problem. In recent years, a machine learning framework called learning-to-prune has been successfully used for solving a diverse range of combinatorial optimisation problems. In this paper, we apply this learning framework to the Steiner tree problem and show that, even on this problem, the learning-to-prune framework results in computing near-optimal solutions at a fraction of the time required by commercial ILP solvers. Our results underscore the potential of the learning-to-prune framework in solving various combinatorial optimisation problems.
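    The abstract gives no implementation detail, so the following is only a rough sketch of the learning-to-prune idea under assumed ingredients: the node features, the classifier, and the pruning threshold below are illustrative choices, not the paper's.

```python
# Hypothetical sketch of learning-to-prune for Steiner tree: a classifier
# scores each non-terminal node, nodes that are very unlikely to appear in
# an optimal tree are removed, and an exact solver (not shown) is run on
# the much smaller residual instance.

import networkx as nx
import numpy as np
from sklearn.linear_model import LogisticRegression

def node_features(graph, node, terminals):
    """Illustrative features only; the paper's feature set may differ."""
    dists = [nx.shortest_path_length(graph, node, t, weight="weight") for t in terminals]
    return [graph.degree(node), min(dists), sum(dists) / len(dists)]

def train_pruner(training_instances):
    # Each training instance: (graph, terminals, nodes of a known optimal tree).
    X, y = [], []
    for graph, terminals, optimal_nodes in training_instances:
        for v in graph.nodes():
            if v in terminals:
                continue                        # terminals are never pruned
            X.append(node_features(graph, v, terminals))
            y.append(1 if v in optimal_nodes else 0)
    return LogisticRegression(max_iter=1000).fit(np.array(X), np.array(y))

def prune(graph, terminals, model, keep_threshold=0.05):
    keep = set(terminals)
    for v in graph.nodes():
        if v in terminals:
            continue
        p = model.predict_proba([node_features(graph, v, terminals)])[0][1]
        if p >= keep_threshold:                 # conservative: drop only clear negatives
            keep.add(v)
    return graph.subgraph(keep)                 # hand this to an exact ILP solver
```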

    Traversing large graphs in realistic settings

    Get PDF
    The notion of graph traversal is of fundamental importance to solving many computational problems. In many modern applications involving graph traversal, such as those arising in the domain of social networks, Internet-based services, fraud detection in telephone calls, etc., the underlying graph is very large and dynamically evolving. This thesis deals with the design and engineering of traversal algorithms for such graphs. We engineer various I/O-efficient Breadth First Search (BFS) algorithms for massive sparse undirected graphs. Our pipelined implementations with low constant factors, together with some heuristics preserving the worst-case guarantees, make BFS viable on massive graphs. We perform an extensive set of experiments to study the effect of various graph properties such as diameter, initial disk layouts, tuning parameters, disk parallelism, and cache-obliviousness on the relative performance of these algorithms. We characterize the performance of NAND flash based storage devices, including many solid state disks. We show that despite the similarities between flash memory and RAM (fast random reads) and between flash disk and hard disk (both are block based devices), the algorithms designed in the RAM model or the external memory model do not realize the full potential of the flash memory devices. We also analyze the effect of misalignments, aging, past I/O patterns, etc. on the performance obtained on these devices. We also consider I/O-efficient BFS algorithms for the case when a hard disk and a solid state disk are used together. We present a simple algorithm which maintains the topological order of a directed acyclic graph with n nodes under an online edge insertion sequence in O(n^{2.75}) time, independent of the number m of edges inserted. For dense DAGs, this is an improvement over the previous best result of O(min(m^{3/2} log(n), m^{3/2} + n^2 log(n))). While our analysis holds only for the incremental setting, our algorithm itself is fully dynamic. We also present the first average-case analysis of online topological ordering algorithms. We prove an expected runtime of O(n^2 polylog(n)) under insertion of the edges of a complete DAG in a random order for various incremental topological ordering algorithms.
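    For reference, a minimal in-memory BFS sketch is shown below; the I/O-efficient algorithms engineered in the thesis compute the same level decomposition but reorganize the work into large sequential scans and sorts, which is not reproduced here.

```python
# Minimal in-memory BFS sketch for reference. An external-memory variant
# would process a whole level at once and sort/merge the generated
# neighbour set instead of doing one random adjacency access per node.

from collections import deque

def bfs_levels(adj, source):
    """adj: dict mapping each node to an iterable of its neighbours."""
    level = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in level:          # first time v is reached
                level[v] = level[u] + 1
                queue.append(v)
    return level
```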

    Any-k: Anytime Top-k Tree Pattern Retrieval in Labeled Graphs

    Full text link
    Many problems in areas as diverse as recommendation systems, social network analysis, semantic search, and distributed root cause analysis can be modeled as pattern search on labeled graphs (also called "heterogeneous information networks" or HINs). Given a large graph and a query pattern with node and edge label constraints, a fundamental challenge is to find the top-k matches according to a ranking function over edge and node weights. For users, it is difficult to select the value k. We therefore propose the novel notion of an any-k ranking algorithm: for a given time budget, return as many of the top-ranked results as possible; then, given additional time, produce the next lower-ranked results quickly as well. It can be stopped anytime, but may continue until all results are returned. This paper focuses on acyclic patterns over arbitrary labeled graphs. We are interested in practical algorithms that effectively exploit (1) properties of heterogeneous networks, in particular selective constraints on labels, and (2) the fact that users often explore only a fraction of the top-ranked results. Our solution, KARPET, carefully integrates aggressive pruning that leverages the acyclic nature of the query with incremental guided search. It enables us to prove strong non-trivial time and space guarantees, which is generally considered very hard for this type of graph search problem. Through experimental studies we show that KARPET achieves running times in the order of milliseconds for tree patterns on large networks with millions of nodes and edges. Comment: To appear in WWW 201
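    KARPET itself is not reproduced here. The sketch below only illustrates the any-k contract under an assumed ranked-result generator: consume matches best-first until a time budget expires, and resume from the same generator if more time is granted later.

```python
# Illustrative any-k consumption loop (not KARPET): given any generator that
# yields matches in ranking order, return as many top results as the time
# budget allows, and keep the generator alive so that further results can be
# pulled later if additional time is granted.

import heapq
import time

def ranked_matches(candidates):
    """Stand-in for a ranked pattern-search generator: yields (score, match)
    pairs in non-decreasing score order via a priority queue."""
    heap = [(score, match) for match, score in candidates.items()]
    heapq.heapify(heap)
    while heap:
        yield heapq.heappop(heap)

def any_k(generator, budget_seconds):
    deadline = time.monotonic() + budget_seconds
    results = []
    for scored_match in generator:
        results.append(scored_match)
        if time.monotonic() >= deadline:
            break                       # stop anytime; the generator stays resumable
    return results

# Usage: the first call returns a top-ranked prefix, a later call with a fresh
# budget continues enumeration from where it stopped.
gen = ranked_matches({"m1": 0.3, "m2": 0.1, "m3": 0.7})
top = any_k(gen, budget_seconds=0.001)
more = any_k(gen, budget_seconds=0.001)
```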

    Learning to Sparsify Travelling Salesman Problem Instances

    Get PDF
    CPAIOR 2021: 18th International Conference on Integration of Constraint Programming, Artificial Intelligence, and Operations Research, Vienna, Austria, 5-8 July 2021. In order to deal with the high development time of exact and approximation algorithms for NP-hard combinatorial optimisation problems and the high running time of exact solvers, deep learning techniques have been used in recent years as an end-to-end approach to find solutions. However, deep learning techniques raise issues of representation, generalisation, complex architectures, and interpretability of models for mathematical analysis. As a compromise, machine learning can be used to improve the run time performance of exact algorithms in a matheuristics framework. In this paper, we use a pruning heuristic leveraging machine learning as a pre-processing step followed by an exact Integer Programming approach. We apply this approach to sparsify instances of the classical travelling salesman problem. Our approach learns which edges in the underlying graph are unlikely to belong to an optimal solution and removes them, thus sparsifying the graph and significantly reducing the number of decision variables. We use carefully selected features derived from linear programming relaxation, cutting planes exploration, minimum-weight spanning tree heuristics and various other local and statistical analyses of the graph. Our learning approach requires very little training data and is amenable to mathematical analysis. We demonstrate that our approach can reliably prune a large fraction of the variables in TSP instances from TSPLIB/MATILDA (>85%) while preserving most of the optimal tour edges. Our approach can successfully prune problem instances even if they lie outside the training distribution, resulting in small optimality gaps between the pruned and original problems in most cases. Using our learning technique, we discover novel heuristics for sparsifying TSP instances that may be of independent interest for variants of the vehicle routing problem. Science Foundation Ireland. Open access funding provided by SFI.
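    As a rough illustration of the sparsification step (the features, the threshold, and the rule of never dropping spanning-tree edges are our assumptions, not the paper's recipe), one could score every edge with a trained classifier and keep only the plausible ones:

```python
# Hypothetical sketch of ML-based TSP sparsification: score each edge with a
# classifier (trained analogously to the learning-to-prune setup above, with
# labels indicating membership in a known optimal tour) and drop edges that
# are very unlikely to appear in an optimal tour.

import networkx as nx

def edge_features(G, mst_edges):
    """Two simple illustrative features per edge: rank of the edge among its
    endpoints' incident edges by weight, and minimum-spanning-tree membership."""
    feats = {}
    for u, v, data in G.edges(data=True):
        w = data["weight"]
        rank_u = sorted(d["weight"] for _, _, d in G.edges(u, data=True)).index(w)
        rank_v = sorted(d["weight"] for _, _, d in G.edges(v, data=True)).index(w)
        in_mst = 1.0 if (u, v) in mst_edges or (v, u) in mst_edges else 0.0
        feats[(u, v)] = [min(rank_u, rank_v), max(rank_u, rank_v), in_mst]
    return feats

def sparsify(G, model, keep_threshold=0.02):
    """model: any classifier with predict_proba, trained on labelled edges."""
    mst_edges = set(nx.minimum_spanning_tree(G).edges())
    feats = edge_features(G, mst_edges)
    H = nx.Graph()
    H.add_nodes_from(G.nodes())
    for (u, v), x in feats.items():
        p = model.predict_proba([x])[0][1]
        if p >= keep_threshold or x[2] == 1.0:   # heuristic: never drop MST edges
            H.add_edge(u, v, weight=G[u][v]["weight"])
    return H          # pass H to an exact TSP/ILP solver over far fewer variables
```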

    Optimal Algorithms for Ranked Enumeration of Answers to Full Conjunctive Queries

    Get PDF
    We study ranked enumeration of join-query results according to very general orders defined by selective dioids. Our main contribution is a framework for ranked enumeration over a class of dynamic programming problems that generalizes seemingly different problems that had been studied in isolation. To this end, we extend classic algorithms that find the k-shortest paths in a weighted graph. For full conjunctive queries, including cyclic ones, our approach is optimal in terms of the time to return the top result and the delay between results. These optimality properties are derived for the widely used notion of data complexity, which treats query size as a constant. By performing a careful cost analysis, we are able to uncover a previously unknown tradeoff between two incomparable enumeration approaches: one has lower complexity when the number of returned results is small, the other when the number is very large. We theoretically and empirically demonstrate the superiority of our techniques over batch algorithms, which produce the full result and then sort it. Our technique is not only faster for returning the first few results, but on some inputs beats the batch algorithm even when all results are produced. Comment: 50 pages, 19 figures
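    As a toy illustration of ranked enumeration versus batch sorting (a heap-based lazy merge over a two-relation product, not the paper's framework over selective dioids), results can be produced one at a time in weight order:

```python
# Ranked enumeration of a tiny two-relation join by total weight: answers are
# produced one at a time in non-decreasing combined weight, so the first
# results arrive without materialising and sorting the full join output.

import heapq

def ranked_join(R, S):
    """R, S: lists of (value, weight) tuples sorted by weight ascending.
    Yields (r_value, s_value, combined_weight) in non-decreasing weight."""
    if not R or not S:
        return
    heap = [(R[0][1] + S[0][1], 0, 0)]   # (combined weight, index in R, index in S)
    seen = {(0, 0)}
    while heap:
        w, i, j = heapq.heappop(heap)
        yield (R[i][0], S[j][0], w)
        for ni, nj in ((i + 1, j), (i, j + 1)):
            if ni < len(R) and nj < len(S) and (ni, nj) not in seen:
                seen.add((ni, nj))
                heapq.heappush(heap, (R[ni][1] + S[nj][1], ni, nj))

# Usage: stop after the first few answers instead of sorting all |R|*|S| pairs.
R = [("a", 1.0), ("b", 3.0)]
S = [("x", 0.5), ("y", 2.0)]
gen = ranked_join(R, S)
first = next(gen)    # ("a", "x", 1.5)
second = next(gen)   # ("a", "y", 3.0)
```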
    • 

    corecore